ABSTRACT

Existing fault-tolerant clock synchronization protocols are shown to result from refining
a single clock synchronization paradigm. In that paradigm, a reliable time source
periodically issues messages that cause processors to resynchronize their clocks. The
reliable time source is approximated by reading all clocks in the system and using a
convergence function to compute a fault-tolerant average of the values read. The
performance of a clock synchronization algorithm based on the paradigm can be
quantified in terms of the two parameters that characterize the behavior of the
convergence function used: accuracy and precision.

1. Introduction


Certain applications require synchronized clocks at the processors in a distributed
system. For example, the accuracy of performance statistics computed in terms of elapsed time
between events at different sites depends on how closely the clocks at participating sites are
synchronized. Also, timeouts and other time-based synchronization schemes (such as the
state-machine approach [Lamport 84]) often involve delays that are proportional to how
closely the clocks at participating sites are synchronized.

Even if we could start all processor clocks at the same time, they probably would not
remain synchronized for long. Crystal clocks found in today's processors run at rates that
differ by as much as $10^{-6}$ seconds per second from real time and thus can drift apart by 1
second every 10 days; clocks based on power-line frequency can drift considerably more than
this: when used as a time base, the power grid in the Northeastern United States typically
drifts 4 to 6 seconds from real time over the course of an evening [Mills 85]. Keeping clocks
in a distributed system synchronized without appealing to a single, centralized time service
requires that clock values be exchanged and adjusted periodically. If failures can result in
faulty processors exhibiting arbitrary behavior, then the protocol has the additional burden of
tolerating erroneous and inconsistent clock values.
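
The drift figure quoted for crystal clocks can be checked with a line of arithmetic. The
sketch below assumes a rate error of $10^{-6}$ seconds per second, the value consistent with
"1 second every 10 days"; the function name is illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def worst_case_drift(rate_error: float, days: float) -> float:
    """Seconds a clock can gain or lose relative to real time over `days`
    when its rate is off by `rate_error` seconds per second."""
    return rate_error * days * SECONDS_PER_DAY


crystal = worst_case_drift(1e-6, 10)
print(f"{crystal:.3f} s over 10 days")
```

The result is a little under one second, so two such clocks drifting in opposite directions
could be nearly two seconds apart after the same period.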

This paper surveys fault-tolerant protocols for synchronizing clocks in a distributed
system where faulty processors can exhibit arbitrary behavior. We show how existing clock
synchronization protocols can be viewed as refinements of a single clock synchronization
paradigm. That paradigm is described in section 2. In section 3, we discuss properties of
convergence functions, the central component of a clock synchronization protocol. Techniques for
reading clocks across a computer-communications network are described in section 4. Section
5 discusses how agreement protocols can improve the performance of a convergence function.
Some conclusions and related work appear in section 6. An appendix derives bounds on the
resynchronization interval.

2.  Clock  Synchronization 

The clock at a correct processor p can be viewed as implementing in hardware a
monotonically increasing, continuous¹ function $c_p$ that maps a real time $t$ to a clock time
$c_p(t)$ that, for some positive constants $\mu$ and $\rho$, satisfies:

Initial Value: $0 \le c_p(0) \le \mu$    (2.1)

Correct Rate: $1-\rho \le \frac{c_p(t_2)-c_p(t_1)}{t_2-t_1} \le 1+\rho$ for $t_1<t_2$.    (2.2)

Condition (2.1) asserts that $c_p$ is initially set to some value within $\mu$ of the real time; (2.2)
asserts that the drift rate of $c_p$ is within $\rho$ of 1 clock-second per real-time second.

¹Strictly speaking, $c_p$ is not continuous because it advances in discrete ticks. However, if
these ticks happen frequently enough, it is impossible for a program running on a processor p
to identify two successive real times $t$ and $t'$ where $c_p(t)=c_p(t')$. Therefore, we can treat
$c_p$ as being continuous.

We make no assumptions about the behavior of clocks at faulty processors, not even
that they can be modeled by functions. A clock on a faulty processor need not increase as
real time passes and might give inaccurate or conflicting information when it is read.

A clock synchronization protocol implements a virtual clock $C_p$ at each processor p.
Virtual clocks at any correct processors p and q satisfy

Synchronization: $|C_q(t) - C_p(t)| \le \delta$ for all $t$,    (2.3)

Rate: $1-\rho \le \frac{C_p(t_2)-C_p(t_1)}{t_2-t_1} \le 1+\rho$ for $t_1<t_2$,    (2.4)

for given constants $\delta$ and $\rho$.


If a reliable time source is available, then clock synchronization is simple. The reliable
time source periodically broadcasts the correct time and, upon receipt of such a broadcast,
each processor adjusts its virtual clock accordingly. Provided the broadcast arrives at each
processor at about the same time, all processors will adjust their clocks so that (2.3) is
satisfied. Provided the broadcast is done frequently enough, processor clocks will not drift too far
apart in the interval between broadcasts, so (2.3) will be maintained. Provided that no
processor has to adjust its clock by too much upon receipt of a broadcast, the adjustment can be
spread over the interval that follows and (2.4) will be maintained. We have only to
implement the reliable time source.


The reliable time source serves two functions in the clock synchronization protocol
outlined above.

RTS1: It periodically generates an event that causes every processor to resynchronize
its clock at about the same time.

RTS2: It provides every processor with a time value that can be used in adjusting that
processor's local clock. If each processor adjusts its clock based on the value it
receives at the time that value is received, then (2.3) will hold.

Note that while one might desire that a reliable time source always be able to provide the
correct time, RTS1 and RTS2 merely require that the correct time be made available
periodically.

Although it is easy to satisfy RTS1 and RTS2 using a single clock, the resulting time
source is not likely to be fault tolerant. Fortunately, a distributed reliable time source that
satisfies RTS1 and RTS2 and is fault-tolerant can be constructed when approximately
synchronized clocks are available. RTS1 is achieved by having each processor attempt
resynchronization when virtual clocks in the system reach a certain value. RTS2 is achieved by
having each processor independently read all the other clocks in the system and compute
some type of fault-tolerant average of the values gathered.

Adjusting a virtual clock $C_p^i$ can be viewed as simply starting another virtual clock that
runs concurrently with the old one. Thus, after the $i$-th adjustment, p starts a new virtual
clock $C_p^{i+1}$. Define $FIX_p^i$ to be the adjustment to $c_p$, processor p's (hardware) clock,
that results in $C_p^i$. That is,

$C_p^i(t) = c_p(t) + FIX_p^i$.

We refer to $C_p^i$ as a superscripted virtual clock to distinguish it from $C_p$.

We can now describe the clock synchronization protocol outlined above for a processor p
in a distributed system consisting of N processors.

    i := 1;  FIX_p^1 := 0;
    do forever
        await Next Synchronization;
        Assume real time is now t_T;
        FIX_p^{i+1} := CF(C_p^i(t_T), C_1^i(t_T), ..., C_N^i(t_T)) - c_p(t_T);
        i := i+1
    od

CF, called a convergence function, implements the fault-tolerant average used to satisfy
RTS2. In particular, $CF(C_p^i(t_T), C_1^i(t_T), ..., C_N^i(t_T))$ provides the value of the reliable
time source at real time $t_T$.
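
A small simulation can make the resynchronization loop concrete. The sketch below is an
idealization under stated assumptions: every processor is correct, hardware clocks run at
fixed rates, clock reads are exact and simultaneous, and the convergence function is a
midpoint taken after discarding the k highest and k lowest readings. The names `simulate`,
`rates`, and `period` are illustrative and not part of the protocol.

```python
def fault_tolerant_midpoint(values, k):
    """Midpoint of the values left after dropping the k highest and k lowest."""
    kept = sorted(values)[k:len(values) - k]
    return (kept[0] + kept[-1]) / 2


def simulate(rates, k, period, rounds):
    """Return the spread of the virtual clocks just before each resynchronization.

    rates[p] is the constant rate of processor p's hardware clock, so
    c_p(t) = rates[p] * t.  Reads are assumed exact (an idealization).
    """
    fixes = [0.0] * len(rates)                  # FIX_p^1 := 0
    spreads = []
    for i in range(1, rounds + 1):
        t = i * period                          # real time of the i-th resynchronization
        hardware = [r * t for r in rates]
        virtual = [h + f for h, f in zip(hardware, fixes)]
        spreads.append(max(virtual) - min(virtual))
        target = fault_tolerant_midpoint(virtual, k)   # stands in for the reliable time source
        fixes = [target - h for h in hardware]         # FIX_p^{i+1} := CF(...) - c_p(t_T)
    return spreads


rho = 1e-5
spreads = simulate([1 - rho, 1 + rho, 1.0, 1 + rho / 2], k=1, period=1000.0, rounds=5)
print(spreads)
```

With periodic resynchronization the spread before each round stays near `2 * rho * period`
instead of growing with elapsed time, which is the behavior required by the
synchronization condition.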

Three important things about this protocol remain unspecified. First, there is the
implementation of "await Next Synchronization". An obvious approach uses $C_p^i$:

    do C_p^i(now) < NextSynch → skip od

where NextSynch is a previously agreed on time. When virtual clocks at correct processors
are synchronized to within $\delta$, this scheme ensures that all processors resynchronize for the
$i$-th time within $\delta$ of each other. Another implementation of "await Next Synchronization" is for
a processor to broadcast a message when $C_p^i(now)=NextSynch$ and resynchronize when
enough such messages have been received. The details of this scheme, which is based on a
simple form of agreement, are given in section 5.


The second item that remains unspecified in our protocol is the convergence function
CF. Properties and examples of convergence functions are the subject of sections 3 and 5.
The final item to be specified in our protocol is how one processor reads the virtual clocks at
other processors. Two techniques for this are discussed in section 4.

Different choices for the three unspecified items in the paradigm result in different clock
synchronization protocols. The choices covered in sections 3, 4, and 5 permit all the
published clock synchronization protocols we know of that do not make use of an external time
source to be viewed in terms of our paradigm. Thus, the paradigm is quite general and
provides a vehicle with which the clock synchronization literature can be surveyed.

3. Convergence Functions

In its most general form, a convergence function CF for use in a system of N processors
is a function of N+1 arguments. The first argument is for the value owned by the processor
invoking CF; each of the following N arguments is for a value from each of the processors
in the system. This means that when a processor p evaluates CF, the same value will appear
in two argument positions: the first and the (p+1)st.

For a function CF to be a convergence function, it must exhibit certain properties. First,
because the relative distribution of the virtual clock values (and not their magnitudes)
should matter when they are combined to implement a reliable time source, CF should satisfy

Translation Invariance: $CF(x_p+v, x_1+v, ..., x_N+v) = CF(x_p, x_1, ..., x_N)+v$.
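
Translation invariance is easy to test numerically for a candidate function. The sketch
below is illustrative only: the helper `is_translation_invariant` and the two candidate
functions are assumptions made for the example, and for brevity the candidates take a
single list of readings rather than the N+1 arguments described above.

```python
import math


def is_translation_invariant(cf, values, shift, tol=1e-9):
    """Check CF(x + v) == CF(x) + v for one sample and one shift v."""
    shifted = cf([x + shift for x in values])
    return abs(shifted - (cf(values) + shift)) <= tol


def average(values):
    return sum(values) / len(values)


def root_mean_square(values):
    return math.sqrt(sum(x * x for x in values) / len(values))


readings = [1.0, 2.0, 3.0]
print(is_translation_invariant(average, readings, 10.0))           # True
print(is_translation_invariant(root_mean_square, readings, 10.0))  # False
```

The plain average depends only on how the readings are distributed relative to each other,
while the root mean square also depends on their magnitudes and so would not qualify.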

Next, we require that when CF is evaluated by two different processors using similar
values for N-k corresponding arguments, it produces values that are closer together than its
arguments. More specifically, for CF to be a convergence function there must exist a
constant k called the fault-tolerance degree and a function $\pi$ called the precision. The
fault-tolerance degree specifies the number of faulty argument values that can be tolerated by CF;
the precision specifies how close together values can be brought using CF. This is formalized
by the

Precision Enhancement Property: If there exist values $\delta$, $\epsilon$, and indices $a_1, ..., a_{N-k}$
such that $x_p = x_{a_i}$ and $y_q = y_{a_j}$ for some $i$ and $j$, and

(a) $(\max i,j: 1 \le i,j \le N-k: |x_{a_i} - x_{a_j}|) \le \delta$

(b) $(\max i,j: 1 \le i,j \le N-k: |y_{a_i} - y_{a_j}|) \le \delta$

(c) $(\forall i: 1 \le i \le N-k: |x_{a_i} - y_{a_i}| \le \epsilon)$

then $|CF(x_p, x_1, ..., x_N) - CF(y_q, y_1, ..., y_N)| \le \pi(\delta, \epsilon)$.

Conditions (a) and (b) define $\delta$ to be the width of the interval containing correct values; when
using CF to implement a reliable time source, these conditions are satisfied if correct virtual
clocks are synchronized to within $\delta$ when read by p and q. Condition (c) stipulates that
corresponding (correct) arguments to CF are at most $\epsilon$ apart; for a reliable time source, this
condition is satisfied if two values obtained by reading the same virtual clock $v$ (real) seconds
apart do not differ by more than $v+\epsilon$ as a result of drift. The Precision Enhancement
Property states that in order for CF to be a convergence function, two evaluations with
arguments satisfying (a)-(c) must produce values that are close (at most $\pi(\delta, \epsilon)$ apart) even
though the values used for k of the arguments (presumably, from faulty processors) differ
arbitrarily. Thus, provided $\pi(\delta, \epsilon) < \delta$, CF implements a time source that furnishes different
processors with time values that are closer than the least synchronized correct virtual clocks.

The final property of a convergence function CF requires that $CF(x_p, x_1, ..., x_N)$ is not
too far from any of its arguments that are within $\delta$ of N-k-1 others.

Accuracy Preservation Property: For values $x_1, x_2, ..., x_N$, and $\delta$, and indices $a_1, ...,
a_{N-k}$ such that $(\max i,j: 1 \le i,j \le N-k: |x_{a_i} - x_{a_j}|) \le \delta$,

$(\max i,j: 1 \le i,j \le N-k: |x_{a_i} - CF(x_{a_j}, x_1, ..., x_N)|) \le \alpha(\delta)$.

Function $\alpha$ is called the accuracy of CF. When CF is used as a reliable time source, provided
$\alpha(\delta) \le \delta$, resynchronizing a clock when correct clocks are no more than $\delta$ apart leaves the
new clock within $\delta$ of all correct clocks.

Examples of Convergence Functions

Examples of functions that satisfy the three properties of convergence functions include:

Egocentric Average: $CF_{EA}(x_p, x_1, ..., x_N)$ is the average of all arguments $x_1$ through $x_N$
that are no more than $\delta$ from $x_p$.

The degree k of fault tolerance for $CF_{EA}$ is characterized by $3k+1 \le N$. Precision $\pi$ is
bounded by $\pi(\delta, \epsilon) = [\ldots] + \epsilon$, where $f$ is the number of arguments that differ in the two
function evaluations; in the worst case, this is slightly less than $\delta+\epsilon$. Accuracy is
bounded by $\alpha(\delta)=4\delta/3$.

Fast Convergence Algorithm: $CF_{FCA}(x_p, x_1, ..., x_N)$ is the average of all arguments $x_1$
through $x_N$ that are within $\delta$ of N-k other arguments.

The degree k of fault tolerance for $CF_{FCA}$ is characterized by $3k+1 \le N$. Precision $\pi$ is
bounded by $\pi(\delta, \epsilon) = [\ldots]$, where $f$ is the number of arguments that differ in the two
function evaluations; in the worst case, this is $2\delta/3+\epsilon$. The accuracy is bounded by
$\alpha(\delta)=4\delta/3$.

Fault-tolerant Midpoint: $CF_{FTM}(x_p, x_1, ..., x_N)$ is the midpoint of arguments $x_1$ through
$x_N$ after the k highest and k lowest values have been discarded.

The degree k of fault tolerance for $CF_{FTM}$ is characterized by $3k+1 \le N$. Precision is
bounded by $\pi(\delta, \epsilon)=\delta/2+\epsilon$; accuracy by $\alpha(\delta)=\delta$.

Fault-tolerant Average: $CF_{FTA}(x_p, x_1, ..., x_N)$ is the average of arguments $x_1$ through $x_N$
after the k highest and k lowest values have been discarded.

The degree k of fault tolerance for $CF_{FTA}$ is characterized by $3k+1 \le N$. Precision $\pi$ is
bounded by $\pi(\delta, \epsilon) = [\ldots]$, where $f$ is the number of arguments that differ in the two
function evaluations; in the worst case, this is slightly less than $\delta+\epsilon$. Accuracy is
bounded by $\alpha(\delta)=\delta$.

$CF_{EA}$ was first proposed and analyzed in [Lamport & Melliar-Smith 85] in connection with a
clock synchronization algorithm. $CF_{FCA}$ is discussed in [Mahaney & Schneider 85], who were
the first to view convergence functions (there, called inexact agreement protocols) in terms of
accuracy and precision. $CF_{FTM}$ and $CF_{FTA}$ are given in [Dolev et al. 83]; $CF_{FTM}$ is the basis for
the clock synchronization protocol of [Lundelius & Lynch 84].
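
The four example functions can be sketched in a few lines each. This is a minimal
illustration, not the cited authors' code. It assumes that `values` holds the N readings
$x_1$ through $x_N$ (including the caller's own), that `own` is the caller's reading $x_p$, and,
for the Fast Convergence Algorithm, that a reading qualifies when at least N-k readings
(counting itself) lie within $\delta$ of it. That last point is one interpretation of "within $\delta$
of N-k other arguments".

```python
def egocentric_average(own, values, delta):
    """Average of the readings no more than delta from the caller's own reading."""
    near = [x for x in values if abs(x - own) <= delta]
    return sum(near) / len(near)


def fast_convergence(values, delta, k):
    """Average of the readings that have at least N-k readings within delta of them.

    Assumption: the count includes the reading itself.
    """
    n = len(values)
    ok = [x for x in values
          if sum(1 for y in values if abs(x - y) <= delta) >= n - k]
    return sum(ok) / len(ok)


def fault_tolerant_midpoint(values, k):
    """Midpoint of the readings left after dropping the k highest and k lowest."""
    kept = sorted(values)[k:len(values) - k]
    return (kept[0] + kept[-1]) / 2


def fault_tolerant_average(values, k):
    """Average of the readings left after dropping the k highest and k lowest."""
    kept = sorted(values)[k:len(values) - k]
    return sum(kept) / len(kept)


# Three correct clocks within 0.5 of each other and one wildly wrong reading.
readings = [10.0, 10.2, 10.4, 99.0]
print(egocentric_average(10.0, readings, delta=0.5))
print(fast_convergence(readings, delta=0.5, k=1))
print(fault_tolerant_midpoint(readings, k=1))
print(fault_tolerant_average(readings, k=1))
```

In this example every function returns a value inside the interval spanned by the three
correct readings, even though one argument is far off.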

4.  Reading  Clocks  from  Afar 

Processors have access to clock time, not real time. This means that in order for a
processor p to read virtual clocks $C_1, ..., C_N$ at the same real time, p must read all N clocks
simultaneously. This is impossible because a processor can do only one thing at a time.
Moreover, message passing is the only way a processor can obtain a clock value from another
in a distributed system. Message delivery times are typically non-trivial and unpredictable.
Thus, it is impossible for a single processor in a distributed system to compute
$CF(C_p^i(t_T), C_1^i(t_T), ..., C_N^i(t_T)) - c_p(t_T)$ as required by the resynchronization protocol
outlined in section 2.

A technique originally proposed in [Lamport & Melliar-Smith 85] allows one processor to
compute an approximation for a virtual clock at another. Each processor p maintains a
collection of tables $\tau_p^i[1..N]$ containing values that transform $C_p^i(t)$ into an approximation for
$C_q^i(t)$. Processor p approximates $C_q^i(t)$ by $C_p^i(t)+\tau_p^i[q]$.

To construct $\tau_p^i$, p periodically communicates with the other processors in the system.
Suppose the minimum delay incurred in sending a message between any pair of correct
processors is $\Gamma_{min}$ and $\Gamma_{max}$ is the maximum delay incurred. Thus, $\Gamma_{max}-\Gamma_{min}$ is the uncertainty
in delivery time for a message. A processor p can compute $\tau_p^i[q]$ by executing

    send "clock time?" to q;
    receive C from q timeout after $2\Gamma_{max}+\kappa$;
    if timed-out then C := $\infty$;
    $\tau_p^i[q]$ := C + $\Gamma_{min}$ - $C_p^i(now)$

where $\kappa$ is the maximum length of time (according to p's clock) it can take q to process the
request made by p and $now$ is the real time at which the statement assigning to $\tau_p[q]$ is
executed.
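
The clock reading step can be illustrated as follows. This sketch assumes the offset is
computed as the reply plus the minimum delay minus the reader's own clock on receipt, which
is what the error analysis in this section implies (only the minimum delay is accounted
for). It also assumes both clocks run at rate 1 while the reply is in flight. The constants
and the helper `simulate_read` are hypothetical values chosen for the example.

```python
import math


def offset_estimate(reply, local_clock_now, gamma_min):
    """Table entry for q: reply + minimum delay - reader's clock on receipt.

    `reply` is None when the request timed out; the entry is then infinite.
    """
    if reply is None:
        return math.inf
    return reply + gamma_min - local_clock_now


GAMMA_MIN, GAMMA_MAX = 0.001, 0.005   # assumed delay bounds, in seconds
TRUE_OFFSET = 5.0                     # q's clock minus p's clock (unknown to p)


def simulate_read(delay, p_clock_at_send=100.0):
    """p's estimate of the offset when q's reply spends `delay` seconds in transit."""
    reply = p_clock_at_send + TRUE_OFFSET          # value q places in its reply
    p_clock_on_receipt = p_clock_at_send + delay
    return offset_estimate(reply, p_clock_on_receipt, GAMMA_MIN)


for delay in (GAMMA_MIN, 0.003, GAMMA_MAX):
    error = abs(simulate_read(delay) - TRUE_OFFSET)
    print(f"delay={delay:.3f}s  error={error:.4f}s")
```

The estimate is exact when the reply takes the minimum delay and is off by the full delay
uncertainty when it takes the maximum.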

Define the clock reading error $\lambda_p(q)$ to be the error in p's approximation of q's
virtual clock, and let $\Lambda$ be the maximum clock reading error. That is,

$|C_q^i(t) - C_p^i(t) - \tau_p^i[q]| \le \lambda_p(q) \le \Lambda$.

In order to compute a bound $\Lambda$ on $\lambda_p(q)$, first note that p's approximation of q's clock drifts
from q's clock by at most $2\rho$ clock seconds per real second. Initially, $\tau_p[q]$ is in error by at
most $\Gamma_{max}-\Gamma_{min}$ since only $\Gamma_{min}$ of the message delay incurred by q's response to p's request
for the time is accounted for in the calculation of $\tau_p[q]$. Thus, at time $t$, $\lambda_p(q)$ satisfies

$\lambda_p(q) \le \Gamma_{max}-\Gamma_{min}+2\rho(t-Lread_p(q)) \le \Lambda$

where $Lread_p(q)$ is the real time that p last executed an assignment to $\tau_p[q]$ in the clock
reading protocol above. Although $\lambda_p(q)$ is a function of $t$, an upper bound on $t-Lread_p(q)$ is
usually known, and therefore $\Lambda$ is a constant.

$\lambda_p(q)$ can be kept small by recomputing $\tau_p[q]$ frequently, thereby keeping $t-Lread_p(q)$
small. In practice, it suffices to obtain clock values from all processors just before computing
$FIX_p^{i+1}$, because this minimizes the clock reading error just before the clock values are
actually needed. However, for reasonable intervals $t-Lread_p(q)$, $2\rho(t-Lread_p(q)) \ll \Gamma_{max}-\Gamma_{min}$,
so minimizing the uncertainty in the network delay is the key to reducing $\lambda_p(q)$.
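
A quick numeric check shows how the two terms of the bound compare. The values below (a
1 ms to 5 ms delay range, a drift rate of $10^{-6}$, and table entries at most 60 seconds old)
are assumptions picked for illustration, as is the function name.

```python
def reading_error_bound(gamma_min, gamma_max, rho, max_age):
    """Bound on the clock reading error: delay uncertainty plus accumulated drift.

    max_age is an upper bound on the time since the table entry was last assigned.
    """
    delay_uncertainty = gamma_max - gamma_min
    drift = 2 * rho * max_age
    return delay_uncertainty + drift


bound = reading_error_bound(gamma_min=0.001, gamma_max=0.005, rho=1e-6, max_age=60.0)
print(f"{bound * 1000:.3f} ms")
```

With these numbers the delay uncertainty contributes 4 ms and drift only 0.12 ms, which is
the sense in which the network term dominates.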

A variation on this scheme [Lundelius & Lynch 84] reduces the number of messages by
half but can increase the clock reading error. Instead of requesting the time, each processor q
periodically broadcasts its virtual clock value (including the superscript). Upon receipt of
such a message, the receiver p updates $\tau_p[q]$. The reduction in number of messages sent is
due to the lack of explicit request messages: the passage of time, rather than a request message,
causes transmission of a clock value. However, in a point-to-point network, clock reading
errors can increase when this variation is used. This increase is because a processor p does
not necessarily know what communications line it should monitor for the next clock message.
Polling the communications lines increases the uncertainty in message delivery delay since it is
possible for a message to remain queued at the receiver for the polling cycle time. Most local
area networks, however, have a single connection between the processor and the network and
therefore do not have this problem.

5. Improved Convergence by Exploiting the Network

An agreement protocol allows correct processors in a distributed system to agree on an
action to be taken or on a set of values. Use of an agreement protocol to disseminate a signal
that causes processors to resynchronize clocks can ensure property RTS1 of a reliable time
source. Use of an agreement protocol to disseminate each processor's clock can enhance the
precision of a convergence function, hence help with RTS2, by ensuring that corresponding
argument positions are equal in two evaluations of CF performed by different processors.

Agreement protocols are generally intended for use with values, not functions like
clocks. The general structure of such a protocol is for a processor to send a copy of every
value it receives to every other processor. After several rounds of this repeated message
exchange, each processor selects one from among the set of values it has received. The
criteria for selection depend on the agreement protocol; use of median or mode is not unusual.
The relaying of messages through different paths, although seemingly inefficient, is a
necessary and important part of most agreement protocols because it prevents correct processors
from being confounded by inconsistent values sent by faulty processors.

It is not difficult to modify an agreement protocol intended for disseminating values to
permit processors to agree on clocks: clock differences, which are relatively static, are
exchanged. A superscripted virtual clock $C_x^i$ is stored as a triple (proc, i, offset) which specifies
a clock with offset offset from the virtual clock with superscript i at processor proc. (Note,
proc need not be the same as x.) Thus, $C_x^i(t)$ is approximated by p as $C_p^i(t)+\tau_p^i[proc]+offset$.
Processor p can send $C_x^i$ to another processor q by executing

    send (proc, i, offset) to q    (5.1)

and q can receive $C_x^i$ by executing

    receive (proc, epoch, offset).    (5.2)

Subsequently, q approximates $C_x^i$ by computing $C_q^i(t)+\tau_q^i[proc]+offset$.

Because $C_q^i+\tau_q^i[proc]$ is an approximation of $C_{proc}^i$, an error is introduced when a clock is
passed from one processor to another in this manner. Consequently, different copies of a
clock received by a single processor might not be identical. Agreement protocols that test for
equality of values must therefore be modified to handle clocks passed around the system in
this fashion. The modification involves considering two values equal if they are
approximately equal. Two values are approximately equal if they are within $\lambda_p(proc)+\lambda_q(proc)$
where p first converted $C_x^i$ to a triple and q reconstructs $C_x^i$. Values are therefore
approximately equal if they are within $2\Lambda$. (Recall, $\Lambda$ is the maximum value of $\lambda_a(b)$ for any
processors a and b.)
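
The triple representation and the relaxed equality test might be sketched as below. The
class and function names are illustrative assumptions; `tau` stands for the receiving
processor's table of offsets to other processors' clocks, and `max_reading_error` for the
maximum clock reading error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockTriple:
    proc: int      # processor whose virtual clock the offset is relative to
    epoch: int     # superscript of that virtual clock
    offset: float  # difference from that virtual clock


def reconstruct(triple, local_clock, tau):
    """Receiver's approximation: own clock + table entry for proc + offset."""
    return local_clock + tau[triple.proc] + triple.offset


def approximately_equal(a, b, max_reading_error):
    """Treat two clock values as equal when within twice the maximum reading error."""
    return abs(a - b) <= 2 * max_reading_error


tau_q = {3: 0.25}                       # q's table entry for processor 3
received = ClockTriple(proc=3, epoch=2, offset=0.5)
value = reconstruct(received, local_clock=100.0, tau=tau_q)
print(value, approximately_equal(value, 100.76, max_reading_error=0.01))
```

Only the relatively static offset travels in the message, so the time the message spends in
transit does not by itself make the transmitted clock stale.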

5.1.  Crusader’s  Agreement 

Crusader's Agreement [Dolev 82] allows a designated processor, called the transmitter, to
disseminate a value in such a way that:

CRU1: All correct processors that do not "know" that the transmitter is faulty agree on
the same value.

CRU2: If the transmitter is correct, then all correct processors agree on its value.

Thus, Crusader's Agreement potentially partitions processors into three classes: those that are
faulty, those that are correct and "know" that the transmitter is faulty, and those that are
correct and have agreed among themselves on a value from the ones sent by the transmitter.
Crusader's Agreement is simple and inexpensive to implement in a distributed system where
fewer than 1/3 of the processors are faulty and reliable communication is possible.² The
following 2-round protocol for Crusader's Agreement allows clock values to be disseminated.

(1) The transmitter sends its clock to all other processors using (5.1).

(2) Each processor uses (5.1) to send the clock it has received using (5.2) from the
transmitter to all processors (including itself).

(3) Each processor sifts through the clocks it received in step (2) to identify a set of at
most k suspicious processors that, if faulty, could account for differences among the
values. If, after ignoring values received from suspicious processors, the differences
in the values that remain are within $2\Lambda$, then agree on the clock received in step (2);
otherwise, decide that the transmitter is faulty.
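
One way to read step (3) is as a search for at least N-k relayed values that agree to within
twice the maximum reading error. Because the values are scalars, it is enough to examine
contiguous windows of the sorted list. The function below is a sketch of that
interpretation, not a definitive statement of the protocol; its name and signature are
assumptions.

```python
def transmitter_looks_correct(relayed, k, max_reading_error):
    """Return True if ignoring at most k relayed values leaves the rest
    within 2 * max_reading_error of each other, else False."""
    values = sorted(relayed)
    need = len(values) - k                 # values that must be mutually consistent
    tolerance = 2 * max_reading_error
    for start in range(len(values) - need + 1):
        window = values[start:start + need]
        if window[-1] - window[0] <= tolerance:
            return True
    return False


print(transmitter_looks_correct([10.0, 10.01, 10.02, 50.0], k=1, max_reading_error=0.02))
print(transmitter_looks_correct([10.0, 10.01, 30.0, 50.0], k=1, max_reading_error=0.02))
```

In the first call a single outlier explains the disagreement, so the processor would agree
on the relayed clock; in the second no single suspicious processor can account for the
differences, so it would decide that the transmitter is faulty.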

The Crusader's Convergence Algorithm $CF_{CCA}$ of [Mahaney & Schneider 85] is the
result of employing Crusader's Agreement to disseminate values before applying $CF_{FCA}$.
$CF_{CCA}$ has half the precision of $CF_{FCA}$ (i.e. convergence is twice as good) and the same
accuracy and degree of fault tolerance. It is interesting to note that when $CF_{FCA}$ is iterated
twice, which requires the same two rounds of message exchange as $CF_{CCA}$, the worst case
precision is $4\delta/9$, clearly inferior to the $\delta/3$ precision achieved when the two rounds of
message exchange are used for a Crusader's Agreement. Employing Crusader's Agreement before
$CF_{EA}$, $CF_{FTM}$ and $CF_{FTA}$ also results in precision improvements for those convergence
functions.

5.2.  Byzantine  Agreement 

Byzantine Agreement [Lamport et al. 82] is stronger than Crusader's Agreement: all
correct processors agree on a value whether or not the transmitter is faulty:

BYZ1: All correct processors agree on the same value.

BYZ2: If the transmitter is correct then all correct processors agree on its value.

The literature contains numerous protocols for establishing Byzantine Agreement. An early
survey of the area appears in [Fisher 83] and a tutorial in [Schneider 85]. One protocol
especially suited for use in local area networks is described in [Babaoglu & Drummond 85]. See
[Lamport & Melliar-Smith 84] for an example of one of the classic protocols in action.

²A communications failure can always be viewed as a failure of either the sending or receiving
processor. Assuming reliable message delivery here is merely an expository convenience.

For use in a convergence function, we can ignore details of implementing a Byzantine
Agreement protocol; it suffices to know what it achieves. When a Byzantine Agreement is
used to disseminate clocks, it ensures that all correct processors agree within $2\Lambda$ on an
approximation for the clock at each processor. Correct processors evaluating a convergence
function will then differ by at most $2\Lambda$ in values in corresponding argument positions. Define
$CF_g$ to be a function that returns its $g$-th largest argument. If $k<g<N-k$ and we employ a
Byzantine Agreement protocol that can tolerate k failures to disseminate the arguments used
in $CF_g$, then we obtain a convergence function for clock synchronization:

(1) Each processor employs the Byzantine Agreement protocol to disseminate its clock.

(2) Each processor then uses $CF_g$ to choose as its new clock the $g$-th fastest clock.

To see why this works, note that provided there are k or fewer failures, the Byzantine
Agreement will ensure that each processor p obtains a vector $v_p[1]$ through $v_p[N]$ of the
clocks at other processors. Due to BYZ2, if q is correct then $v_p[q](t)$ must be within $2\Lambda$ of
$C_q(t)$. Without loss of generality, assume that $v_p[1](t)>v_p[2](t)> \cdots >v_p[N](t)$. According
to BYZ1, $|v_p[g](t)-v_q[g](t)| \le 2\Lambda$ for all correct processors p and q. Thus, by selecting the
$g$-th largest clock, we are guaranteed that the clock selected by each processor reads within
$\epsilon=2\Lambda$ of the clock selected by every other. This means that the precision of the algorithm is
$\pi(\delta, \epsilon)=2\Lambda$; the precision for the convergence function is independent of $\delta$! To bound the
accuracy, note that because $k<g<N-k$, the $g$-th largest clock lies between correct clocks. If
correct clocks are within $\delta$, then the new clock is no more than $\delta$ away from a correct clock,
so we conclude that the accuracy of the algorithm is $\alpha(\delta)=\delta$.
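
The precision argument rests on a property of order statistics: if two vectors agree to
within some tolerance in every position, their $g$-th largest elements agree to within the
same tolerance. The sketch below checks this on pseudo-random data; the name `cf_g`, the
seed, and the ranges are illustrative assumptions.

```python
import random


def cf_g(values, g):
    """Return the g-th largest value (g = 1 selects the largest)."""
    return sorted(values, reverse=True)[g - 1]


rng = random.Random(0)        # fixed seed so the run is repeatable
tolerance = 0.02              # plays the role of twice the maximum reading error
worst = 0.0
for _ in range(1000):
    v_p = [rng.uniform(0.0, 100.0) for _ in range(7)]
    # q's copy of each clock differs from p's copy by at most the tolerance
    v_q = [x + rng.uniform(-tolerance, tolerance) for x in v_p]
    for g in range(1, 8):
        worst = max(worst, abs(cf_g(v_p, g) - cf_g(v_q, g)))

print(worst <= tolerance + 1e-9)
```

The selected clocks never differ by more than the per-position tolerance, regardless of how
far apart the clocks themselves are.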

Clock synchronization algorithms based on Byzantine Agreement are described in
[Lamport & Melliar-Smith 84] and analyzed in [Lamport & Melliar-Smith 85].

5.3.  An  Optimization 

The convergence function in the preceding section involves Byzantine Agreements for
values that are not needed; all the clocks are disseminated, but only the largest g+1 are used.
(Only the (g+1)st largest clock is used for resynchronization, but to determine which clock is
the (g+1)st largest, the g largest clocks are needed.) Since Byzantine Agreement protocols
can be costly, in both time and number of messages exchanged, avoiding unnecessary
Byzantine Agreements is prudent. We therefore propose a somewhat weaker form of
agreement to take the place of the Byzantine Agreements used above and use it only for those
clocks that are actually needed.

A Fireworks Agreement allows a collection of processors each with a value v to accept
messages with that value at about the same time:

FW: All correct processors accept a message with value v within $\beta$ real seconds of
each other.

The thing being agreed on in a Fireworks Agreement is the real time that a value is accepted,
not the value itself. The name Fireworks Agreement is in analogy with a public fireworks
display: all participants agree on when the display is over. In a fireworks display, $\beta$ is
non-zero if observers are different distances from the pyrotechnics; in a distributed system, $\beta$ will
be related to the variance in message-delivery times.

In describing a protocol to implement Fireworks Agreement, we assume that it is
possible for a correct processor to

A1: authenticate the origin of every message it receives and

A2: determine whether a message it receives was modified by processors that
relay the message.

These assumptions are satisfied if digital signatures are employed by the sender of a message
or if there is a direct link between every pair of processors and the simulated authentication
technique of [Srikanth & Toueg 84] is used to transmit messages. In either case, faulty
processors are unable to masquerade as correct processors by sending messages and are unable to
modify messages sent originally by correct processors before retransmitting them.

The following protocol implements a Fireworks Agreement with $\beta = [...]$ for use in
clock synchronization in a system containing virtual clocks satisfying (2.3).^ The agreement is
for a message with value $T+a$, which will be the value virtual clocks have when the protocol
terminates. The protocol is started by a processor when its clock reaches $T$, the a priori
designated time for the next clock synchronization.

(1) When $\hat{c}_p(t) = T$, a processor $p$ broadcasts $\langle T+a, p \rangle$ to all other processors.

(2) Upon receiving $\langle T+a, q \rangle$ directly from a processor $q$, a processor $p$ relays $\langle T+a, q \rangle$
to all processors.

(3) Upon receiving values $\langle T+a, p_1 \rangle$, ..., $\langle T+a, p_{k+1} \rangle$ where $p_i \ne p_j$ for $i \ne j$: If the last
message received, $\langle T+a, p_{k+1} \rangle$, was received directly from $p_{k+1}$, then delay [...]
and accept $T+a$. If the last message received, $\langle T+a, p_{k+1} \rangle$, was not received
directly from $p_{k+1}$, then accept $T+a$ immediately.
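The acceptance rule of step (3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the class `FireworksAcceptor`, its method `on_message`, and the parameter `direct_delay` are hypothetical names, and the length of the delay in step (3) is left as a parameter because its value is not recoverable here. Authentication (A1, A2) and the relaying of step (2) are assumed to happen elsewhere.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class FireworksAcceptor:
    """Tracks step (3) of the protocol at one processor (illustrative only)."""
    k: int                                # assumed bound on the number of faulty processors
    direct_delay: float                   # hypothetical: delay applied after a direct receipt
    origins: Set[str] = field(default_factory=set)
    accept_at: Optional[float] = None     # local time at which T+a is accepted

    def on_message(self, origin: str, direct: bool, now: float) -> Optional[float]:
        """Record a <T+a, origin> message; return the acceptance time once known."""
        if self.accept_at is not None:
            return self.accept_at         # already decided; later messages change nothing
        if origin in self.origins:
            return None                   # a repeated origin does not count again
        self.origins.add(origin)
        if len(self.origins) < self.k + 1:
            return None                   # fewer than k+1 distinct origins so far
        # The (k+1)st distinct origin has arrived: delay only if it came directly.
        self.accept_at = now + self.direct_delay if direct else now
        return self.accept_at
```

With `k=1`, the processor accepts on the second distinct origin: after a delay if that message came directly from its originator, and immediately if it was relayed.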

Assumptions A1 and A2 make it impossible for faulty processors to fool correct processors
that are trying to determine the origin of a message or whether the message was relayed as
required by steps (2) and (3) of the protocol. Steps (1) and (2) of the protocol together
ensure that a value received by any correct processor is received by every correct processor.

^Recall, $\tau_{min}$ is the minimum message delivery time and $\tau_{max}$ the maximum message delivery time.


Step (3) ensures that a value is accepted by all processors within $\beta$ real seconds of each other.
Moreover, because there are at most $k$ faulty processors, step (3) ensures that a value is
accepted only after that value has been received from some correct processor.

When Fireworks Agreement is used in constructing a convergence function CF to implement
a reliable time source, the value of CF is the time the $T+a$ message is accepted. Let $t_p$
be the real time the message is accepted by processor $p$ and let $t_q$ be the real time the message
is accepted by processor $q$. Due to FW, evaluations of the convergence function at correct
processors $p$ and $q$ can differ by at most $\beta = [...]$. Thus, setting $\hat{c}_p^{i+1}(t_p) = T+a$ and
$\hat{c}_q^{i+1}(t_q) = T+a$ satisfies the Precision Enhancement Property, with $\pi(\delta, \epsilon) = (1+\rho)\beta$.
Accuracy $\alpha$ of the Accuracy Preservation Property is given by $\alpha(\delta) = (\delta + 2\tau_{max})(1+\rho) - a$ because
in the worst case $\delta$ seconds can elapse between when the first correct processor reaches $T$ and
broadcasts its message and when the $(k+1)$st processor broadcasts its message, followed by an
additional $2\tau_{max}$ seconds for the protocol to complete.
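To make the two parameters concrete, the following sketch evaluates the precision and accuracy expressions for a Fireworks-based convergence function. The formulas are transcribed from the paragraph above as best they can be read, $\pi = (1+\rho)\beta$ and $\alpha(\delta) = (\delta + 2\tau_{max})(1+\rho) - a$; the function names and every numeric value are hypothetical and chosen only for illustration.

```python
def fireworks_precision(rho: float, beta: float) -> float:
    """pi(delta, epsilon) = (1 + rho) * beta; it depends on neither delta nor epsilon."""
    return (1.0 + rho) * beta


def fireworks_accuracy(delta: float, tau_max: float, rho: float, a: float) -> float:
    """alpha(delta) = (delta + 2 * tau_max) * (1 + rho) - a."""
    return (delta + 2.0 * tau_max) * (1.0 + rho) - a


if __name__ == "__main__":
    # Hypothetical parameters (seconds), not taken from the paper.
    rho, beta = 1e-6, 0.010
    delta, tau_max, a = 0.020, 0.015, 0.030
    print("precision:", fireworks_precision(rho, beta))
    print("accuracy: ", fireworks_accuracy(delta, tau_max, rho, a))
```

Note how the constant $a$ offsets the accuracy: choosing $a$ close to the worst-case protocol duration keeps the shift applied to a clock small.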

Clock synchronization algorithms based on Fireworks Agreement are interesting because
a processor cannot evaluate CF without causing every other correct processor to resynchronize
its clock. Thus, the convergence function provides an implementation of both RTS1 and
RTS2; the other convergence functions discussed in this paper provide an implementation of
only RTS2.

The first clock synchronization protocol to be based on Fireworks Agreement is discussed
in [Halpern et al. 84]. A more recent algorithm [Srikanth & Toueg 85] implements
virtual clocks with rates much closer to the rate of the hardware clocks on which they are
based.

6.  Discussion  and  Conclusions 

We have discussed clock synchronization protocols that can be viewed as refinements of
a single paradigm. The paradigm is based on postulating a reliable time source that periodically
issues messages to cause processors to synchronize their clocks. The reliable time source
is implemented by evaluating a convergence function on the values of processor clocks. Thus,
if processor clocks run close together but far from real time, clocks implemented by an algorithm
based on this paradigm will remain synchronized with each other but will diverge from
real time.

In order to construct a clock synchronization algorithm that keeps clocks close to real
time, the reliable time source must remain close to real time. Various international standards
organizations maintain highly accurate synchronized clocks. In the United States, WWV
radio broadcasts at 60 kHz provide a time signal accurate to a few milliseconds, as does the
GOES satellite. (WWV broadcasts at 5, 10, and 15 MHz are accurate to only 100 milliseconds,
due to uncertainty in propagation delays.) Employing radio receivers to inject
such correct real times into a distributed system is one way to provide the needed source of
time. Algorithms for clock synchronization when an external source of time is available are
described in [Marzullo & Owicki 83], [Marzullo 84], and [Lamport 85].

The fact that so many clock synchronization algorithms can be viewed in terms of a single
paradigm came as a bit of a surprise. Previously, clock synchronization algorithms were
viewed in terms of three classes: those based on convergence, those based on agreement, and
those in the style of [Halpern et al. 84]. It was pleasing to discover that all the published
algorithms can, in fact, be viewed in terms of a single paradigm based on convergence functions.
In addition, viewing algorithms as refinements of a single paradigm allows their performance
to be compared. Performance of a clock synchronization algorithm based on convergence
functions is characterized by $\pi$, $\alpha$, and the cost of computing the underlying convergence
function. Thus, by defining the notion of a convergence function and giving a framework
in which its performance can be quantified, we have made it possible to compare existing
algorithms as well as given insight into the construction of new algorithms.


Acknowledgments

Discussions with Ozalp Babaoglu, Steve Mahaney, Leslie Lamport, and Sam Toueg have been helpful. In
addition, I am grateful to Ozalp Babaoglu, David Gries and Jacob Aizikowitz for useful comments on an early
version of this paper. The diagram in the appendix was promptly and expertly prepared by Lori Dyess. The
notions of accuracy and precision were developed jointly with Steve Mahaney under a consulting agreement with
AT&T Bell Laboratories.


References

[Babaoglu & Drummond 85] Babaoglu, O. and R. Drummond. Streets of Byzantium: Network architectures for
fast reliable broadcasts. IEEE Trans. on Software Engineering SE-11, 6 (June 1985).

[Dolev 82] Dolev, D. The Byzantine Generals strike again. Journal of Algorithms 3 (1982), 14-30.

[Dolev et al. 83] Dolev, D., N.A. Lynch, S.S. Pinter, E.W. Stark, and W.E. Weihl. Reaching approximate
agreement in the presence of faults. Proc. Third Symposium on Reliability in Distributed Software and
Database Systems, Oct. 1983, IEEE Computer Society, 143-154.

[Fischer 83] Fischer, M. The consensus problem in unreliable distributed systems (a brief survey). Proc.
International Conference on Foundations of Computation Theory, Borgholm, Sweden, August 1983.

[Halpern et al. 84] Halpern, J., B. Simons, R. Strong, and D. Dolev. Fault-tolerant clock synchronization.
Proc. of the Third ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, Vancouver,
Canada, August 1984, 89-102.

[Lamport 84] Lamport, L. Using time instead of timeout for fault-tolerance in distributed systems. ACM
TOPLAS 6, 2 (April 1984), 254-280.

[Lamport 85] Lamport, L. Notes on a time service: Preliminary Report, DEC SRC, Palo Alto, CA, Nov. 1985.

[Lamport & Melliar-Smith 84] Lamport, L. and P.M. Melliar-Smith. Byzantine clock synchronization. Proc.
Third ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, Vancouver, Canada,
August 1984, 68-74.

[Lamport & Melliar-Smith 85] Lamport, L. and P.M. Melliar-Smith. Synchronizing clocks in the presence of
faults. J. ACM 32, 1 (Jan. 1985), 52-78.

[Lamport et al. 82] Lamport, L., R. Shostak, and M. Pease. The Byzantine generals problem. ACM TOPLAS 4,
3 (July 1982), 382-401.

[Lundelius & Lynch 84] Lundelius, J. and N. Lynch. A new fault-tolerant algorithm for clock synchronization.
Proc. of the Third ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, Vancouver,
Canada, August 1984, 75-88.

[Mahaney & Schneider 85] Mahaney, S.R. and F.B. Schneider. Inexact agreement: Accuracy, precision, and
graceful degradation. Proc. of the Fourth ACM SIGACT-SIGOPS Symposium on Principles of Distributed
Computing, Minaki, Ontario, Canada, August 1985, 237-249.

[Marzullo & Owicki 83] Marzullo, K. and S.S. Owicki. Maintaining the time in a distributed system. Proc. of
the Second ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, Montreal, Quebec,
Canada, August 1983, 295-305.

[Marzullo 84] Marzullo, K. Maintaining the time in a distributed system: An example of a loosely-coupled
distributed service. Ph.D. Thesis, Department of Electrical Engineering, Stanford University.

[Mills 85] Mills, D.L. Experiments in network clock synchronization. ARPANet RFC 957, Sept. 1985.

[Schneider 85] Schneider, F.B. Paradigms for distributed programs. In Distributed Systems: Methods and Tools
for Specification, M. Paul and H.J. Siegert, eds. Lecture Notes in Computer Science, Vol. 190,
Springer-Verlag, Berlin, 1985, 432-443.

[Srikanth & Toueg 84] Srikanth, T.K. and S. Toueg. Simulating authenticated broadcasts to derive simple
fault-tolerant algorithms. Technical Report TR 84-623, Department of Computer Science, Cornell
University, Ithaca, New York, July 1984.

[Srikanth & Toueg 85] Srikanth, T.K. and S. Toueg. Optimal clock synchronization. Proc. of the Fourth ACM
SIGACT-SIGOPS Symposium on Principles of Distributed Computing, Minaki, Ontario, Canada, August
1985, 71-86.


Appendix: Resynchronization Interval

The maximum interval that can elapse before starting a new virtual clock depends on the
maximum rate at which virtual clocks drift apart, how closely virtual clocks are synchronized,
and the precision and accuracy of the convergence function being used. In this appendix, we
give the precise relationship between these parameters.

Notice that in the clock synchronization protocol of section 2, $\hat{c}_p^{i+1}$ is computed using
virtual clocks $\hat{c}_q^i$, for all processors $q$. Thus, we require

Concurrent Clocks Property: $\hat{c}_p^i$ must have been started at every processor $p$ if $\hat{c}_q^{i+1}$ has
been started at any processor $q$.

Let $R_{min}$ be the minimum clock time that can elapse between successive clock resynchronizations
by any processor. If virtual clocks are synchronized to within $\delta$, then, provided

[...]    (7.1)

the Concurrent Clocks Property will hold.


From the Concurrent Clocks Property, we conclude that the number of virtual clocks
that have been started by each processor can differ at most by 1 at any real time $t$. This
allows the synchronization requirement on virtual clocks (2.3) to be reformulated in terms of
superscripted virtual clocks. Let $t_p^i$ be the real time that superscripted virtual clock $\hat{c}_p^i$ is
started by processor $p$. Then we have,

$|\hat{c}_p^i(t) - \hat{c}_q^i(t)| \le \delta$  for  $\max(t_p^i, t_q^i) \le t < \max(t_p^{i+1}, t_q^{i+1})$    (7.2)

$|\hat{c}_p^{i+1}(t) - \hat{c}_q^i(t)| \le \delta$  for  $t_p^{i+1} \le t < t_q^{i+1}$    (7.3)


Let $R_{max}$ be the maximum real time that can elapse before clock resynchronization is
necessary to preserve (7.2) and (7.3). Consider a processor $p$ with a virtual clock implemented
by $\hat{c}_p^i$ that is running (slow) at $1-\rho$ clock seconds per second and a processor $q$
with a virtual clock implemented by $\hat{c}_q^i$ that is running (fast) at $1+\rho$ clock seconds per
second. (See Figure 1.) Now, suppose $p$ is the last processor to start its clock $\hat{c}_p^i$, $q$ is the
first, and that at real time $t_p^i$ when $p$ starts $\hat{c}_p^i$,

$\hat{c}_q^i(t_p^i) = \hat{c}_p^i(t_p^i) + \delta_p^i$.

Due to the definition of $R_{max}$, $p$ will start $\hat{c}_p^{i+1}$ at real time $t_p^{i+1} = t_p^i + R_{max}$ and $q$ will start
$\hat{c}_q^{i+1}$ at real time

$t_q^{i+1} = [...]$

because

$\hat{c}_q^i(t_q^{i+1}) = \hat{c}_p^i(t_p^i) + \delta_p^i + (t_q^{i+1} - t_p^i)(1+\rho) \le \hat{c}_p^i(t_p^i) + R_{max}(1-\rho)$

If at time $t_q^{i+1}$ correct virtual clocks (with superscript $i$) are in a $\delta_q^{i+1}$ wide interval, then
due to the Accuracy Preservation Property, starting $\hat{c}_q^{i+1}$ results in a virtual clock that can be
as much as $\alpha(\delta_q^{i+1} + \Lambda)$ from any correct virtual clock with superscript $i$, because we (pessimistically)
assume that $q$ approximates all clocks high by maximum clock reading error $\Lambda$. In
the worst case, $\hat{c}_q^{i+1}$ will continue to run as fast as possible, so by $t_p^{i+1}$, $\hat{c}_q^{i+1}$ could be as much as
$\alpha(\delta_q^{i+1} + \Lambda) + 2\rho(t_p^{i+1} - t_q^{i+1})$ away from $\hat{c}_p^i$. Therefore, to satisfy requirement (7.3), we must
have

$\alpha(\delta_q^{i+1} + \Lambda) + 2\rho(t_p^{i+1} - t_q^{i+1}) \le \delta$.    (7.5)

And, to satisfy requirement (7.2), we must have

$\delta_p^i + 2\rho R_{max} \le \delta$.    (7.6)

Figure 1. Clock Resynchronization Scenario

All that remains is to determine $\delta_q^{i+1}$ and $\delta_p^{i+1}$. Since at real time $t_p^i$ the virtual clocks
at correct processors are $\delta_p^i$ apart, by $t_q^{i+1}$ they can be as much as $2\rho(t_q^{i+1} - t_p^i) + \delta_p^i$ apart.
Substituting for $t_q^{i+1} - t_p^i$ based on (7.4), we get

[...]

Finally, $\delta_p^i$ is defined inductively as follows. For the first epoch, which starts at real
time 0 and implements virtual clocks superscripted by 1, we have $\delta_p^1$ [...], due to initial value
condition (2.1) and the definition of $FIX_p^1$ in the protocol of section 2. For the $(i+1)$st epoch,
we have $\delta_p^{i+1}$ [...]. We now consider two cases, depending on how RTS1
is achieved.


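The first case we consider is where processors resynchronize their clocks when the previous
superscripted virtual clock reaches some given value. The worst case is when $\hat{c}_q^{i+1}$, once
started, continues to run as fast as possible, in which case^

$\hat{c}_q^{i+1}(t_p^{i+1}) = CF(\hat{c}_q^i(t_q^{i+1}), \hat{c}_1^i(t_q^{i+1}) + \lambda_q(1), ..., \hat{c}_N^i(t_q^{i+1}) + \lambda_q(N)) + (t_p^{i+1} - t_q^{i+1})(1+\rho)$.

By definition,

$\hat{c}_p^{i+1}(t_p^{i+1}) = CF(\hat{c}_p^i(t_p^{i+1}), \hat{c}_1^i(t_p^{i+1}) + \lambda_p(1), ..., \hat{c}_N^i(t_p^{i+1}) + \lambda_p(N))$.

Thus, we have

$\delta_p^{i+1} = |CF(\hat{c}_q^i(t_q^{i+1}), \hat{c}_1^i(t_q^{i+1}) + \lambda_q(1), ..., \hat{c}_N^i(t_q^{i+1}) + \lambda_q(N)) + (t_p^{i+1} - t_q^{i+1})(1+\rho)$
$\qquad - CF(\hat{c}_p^i(t_p^{i+1}), \hat{c}_1^i(t_p^{i+1}) + \lambda_p(1), ..., \hat{c}_N^i(t_p^{i+1}) + \lambda_p(N))|$

$\quad = |CF(\hat{c}_q^i(t_q^{i+1}) + (t_p^{i+1} - t_q^{i+1})(1+\rho), ..., \hat{c}_N^i(t_q^{i+1}) + \lambda_q(N) + (t_p^{i+1} - t_q^{i+1})(1+\rho))$
$\qquad - CF(\hat{c}_p^i(t_p^{i+1}), \hat{c}_1^i(t_p^{i+1}) + \lambda_p(1), ..., \hat{c}_N^i(t_p^{i+1}) + \lambda_p(N))|$

due to translation invariance. Since the clock reading error for correct processors is bounded
by $\Lambda$, the value of each argument (in the second evaluation of CF) $\hat{c}_a^i(t_p^{i+1}) + \lambda_p(a)$ for any
correct processor $a$ satisfies the following inequality:

[...]

Thus, the difference between an argument in the second evaluation of CF and the
corresponding argument $\hat{c}_a^i(t_q^{i+1}) + \lambda_q(a) + (t_p^{i+1} - t_q^{i+1})(1+\rho)$ in the first evaluation of CF is
bounded by $2\rho(t_p^{i+1} - t_q^{i+1}) + \Lambda$. Provided CF has sufficient fault tolerance degree to cope with
faulty processors, we can use the Precision Enhancement Property of CF with $\delta = \delta$ due to
(7.5) and $\epsilon = 2\rho(t_p^{i+1} - t_q^{i+1}) + \Lambda$ to conclude that

$\delta_p^{i+1} \le \pi(\delta, 2\rho(t_p^{i+1} - t_q^{i+1}) + \Lambda)$.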
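The role of translation invariance in the step above can be checked numerically for a concrete convergence function. The sketch below uses a fault-tolerant midpoint (discard the k smallest and k largest readings, then take the midpoint of what remains), chosen only as a familiar example; the name `ft_midpoint` and the sample readings are hypothetical, and the function takes just the list of readings rather than the argument list used for CF in the text.

```python
from typing import Sequence


def ft_midpoint(readings: Sequence[float], k: int) -> float:
    """Midpoint of the readings left after dropping the k smallest and k largest."""
    if len(readings) <= 2 * k:
        raise ValueError("need more than 2k readings")
    kept = sorted(readings)[k:len(readings) - k]
    return (kept[0] + kept[-1]) / 2.0


# Three readings close together plus one outlier standing in for a faulty clock.
readings = [100.0, 100.5, 101.0, 250.0]
shift = 8.0

# Translation invariance: shifting every argument by v shifts the result by v.
shifted_then_cf = ft_midpoint([x + shift for x in readings], k=1)
cf_then_shifted = ft_midpoint(readings, k=1) + shift
```

Both orders give the same value here, which is the property that lets the term $(t_p^{i+1} - t_q^{i+1})(1+\rho)$ be moved inside the evaluation of CF.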

The second case we consider is where all processors resynchronize their clocks within $\beta$
real seconds and all start their new clocks at a given value $T+a$. This case corresponds to the
use of a Fireworks Agreement and is much simpler than the previous one. By definition,
$|t_p^{i+1} - t_q^{i+1}| \le \beta$. Because $q$ can run as long as $\beta$ seconds before the new clock at $p$, $\hat{c}_p^{i+1}$,
is started, $\hat{c}_q^{i+1}(t_p^{i+1})$ can be as large as $T + a + (1+\rho)\beta$. Thus, we have $\delta_p^{i+1} \le (1+\rho)\beta$ because
both $\hat{c}_p^{i+1}$ and $\hat{c}_q^{i+1}$ start with value $T+a$.

^Recall, $\lambda_p(v)$ is the error associated with $p$ reading the clock at $v$.

Putting this all together, the interval $R$ in real seconds between clock resynchronizations
must satisfy $R \le R_{max}$ where $R_{max}$ satisfies (7.5) and (7.6). Since virtual clocks do not necessarily
run at 1 clock second per second, the resynchronization interval $RI$ in clock seconds
used by every processor must satisfy $RI/(1+\rho) \le R_{max}$ so that the fastest processor does not
exceed the bound. Combining this with the lower bound given by (7.1), we get

$\delta < \frac{RI}{1+\rho} \le R_{max}$.    (7.7)
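As a rough numeric illustration of how these bounds constrain the resynchronization interval, the sketch below solves two of them for the largest admissible values. It assumes (7.6) has the form $\delta_p^i + 2\rho R_{max} \le \delta$ and that the upper bound in (7.7) is $RI/(1+\rho) \le R_{max}$; condition (7.5) is ignored, so the result is only an upper limit. The function names and sample numbers are hypothetical.

```python
def max_real_interval(delta: float, delta_start: float, rho: float) -> float:
    """Largest R_max (real seconds) with delta_start + 2*rho*R_max <= delta."""
    if delta_start > delta:
        raise ValueError("clocks start further apart than delta")
    return (delta - delta_start) / (2.0 * rho)


def max_clock_interval(r_max: float, rho: float) -> float:
    """Largest RI (clock seconds) with RI / (1 + rho) <= r_max."""
    return r_max * (1.0 + rho)


if __name__ == "__main__":
    # Hypothetical values: 10 ms target, 4 ms spread after resynchronization,
    # drift rate 1e-6 seconds per second.
    r_max = max_real_interval(delta=0.010, delta_start=0.004, rho=1e-6)
    print("R_max:", r_max, "RI:", max_clock_interval(r_max, rho=1e-6))
```

With these sample numbers the clocks may run for about 3000 real seconds between resynchronizations before the 10 ms bound is at risk.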


Virtual  Clock  Rates 


Simply setting a clock ahead or back in order to maintain synchronization with other
clocks can cause problems. In real-time process-control applications, tasks are broken into
small computations and scheduled based on clock readings to ensure that real-time deadlines
can be met. If a clock synchronization protocol suddenly sets a clock forward, the processor
might not be able to handle all the tasks that have become due. In other applications, clock
times are used to infer possible causality between events. For example, creation times for
files are usually taken to define the order in which the files were created. Suddenly setting a
clock back can destroy the consistency of time with potential causality. Finally, when clocks
are used to obtain performance measurements, a sudden shift in the clock value can introduce
errors by the amount of the shift.


For these reasons, a clock synchronization protocol must satisfy a rate restriction like
(2.4), which prevents the value of the clock from changing by too large an amount over too
short an interval. One way to satisfy (2.4) is to include as part of a time value the superscript
of the virtual clock that furnished that value and choose $\hat{\rho}$ such that $\rho \le \hat{\rho}$. According to
(2.2), clocks at correct processors run at a rate between $1-\rho$ and $1+\rho$. Thus, clock values
with the same superscript can be compared and manipulated because they were obtained from
a set of clocks satisfying (2.4). Clock values with different superscripts, however, do not
have this property. These values are incomparable because of the discontinuity when a new
virtual clock is started. This is an obvious limitation of the scheme, since time values that are
far apart are likely to have come from virtual clocks with different superscripts.
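A time value that carries its superscript might be represented as in the sketch below. The type `TimeValue` and its method `elapsed_since` are hypothetical names introduced for illustration; the only behavior taken from the text is that values are comparable when, and only when, their superscripts match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeValue:
    superscript: int   # index i of the virtual clock that furnished the reading
    value: float       # clock reading in clock seconds

    def elapsed_since(self, earlier: "TimeValue") -> float:
        """Difference of two readings; defined only within one virtual clock."""
        if self.superscript != earlier.superscript:
            raise ValueError("time values from different virtual clocks are incomparable")
        return self.value - earlier.value
```

Refusing the subtraction, rather than returning a number, makes the limitation explicit: a caller measuring a long interval has to handle the case where a resynchronization fell between the two readings.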


A second way to satisfy (2.4) is by evenly spreading any change between $FIX_p^{i-1}$ and
$FIX_p^i$ over the entire $i$th epoch. Instead of making an instantaneous shift in the value of $\hat{c}_p$
when $\hat{c}_p^i$ is started, the clock drift rate is modified to compensate for the change. According
to (7.7), an epoch lasts at most $RI$ clock seconds. Thus, we implement $\hat{c}_p^i$ by incrementing
$\hat{c}_p^{i-1}$ by $tick_p$ whenever $c_p$ is incremented.

$[...] \frac{FIX_p^i - FIX_p^{i-1}}{RI}$
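The idea of spreading a correction over an epoch can be sketched as follows. This is an illustration under an explicit assumption, since the exact definition of $tick_p$ is not legible here: each hardware increment `dt` is taken to advance the virtual clock by `dt * (1 + (fix_new - fix_old) / ri)`, so that the whole correction has been absorbed after `ri` seconds of hardware time. The class and method names are hypothetical.

```python
class AmortizedClock:
    """Virtual clock that absorbs a correction gradually (illustrative only)."""

    def __init__(self, start_value: float, ri: float):
        self.value = start_value   # current virtual clock reading
        self.ri = ri               # assumed epoch length RI in clock seconds
        self.rate_adjust = 0.0     # extra clock seconds added per hardware second

    def start_epoch(self, fix_old: float, fix_new: float) -> None:
        """Spread the change fix_new - fix_old evenly over the coming epoch."""
        self.rate_adjust = (fix_new - fix_old) / self.ri

    def on_hardware_tick(self, dt: float) -> None:
        """Advance by one hardware increment dt plus its share of the correction."""
        self.value += dt * (1.0 + self.rate_adjust)


clock = AmortizedClock(start_value=0.0, ri=100.0)
clock.start_epoch(fix_old=0.0, fix_new=2.0)   # a +2 s correction to absorb
for _ in range(100):
    clock.on_hardware_tick(1.0)               # 100 hardware seconds elapse
```

After the loop the virtual clock has gained the full correction without ever jumping, at the cost of running at a slightly different rate for the whole epoch.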


The drift of $\hat{c}_p$ due to this compensation can be computed as follows. According to the
Accuracy Preservation Property, a clock value can be shifted by at most $\alpha(\delta)$ when it is
resynchronized provided correct processors lie within an interval of width $\delta$. Since (2.3)
ensures that any two correct clocks are within $\delta$, we conclude that a correct clock at processor
$p$ can be shifted by at most $\alpha(\delta)$, and therefore

$0 < tick_p$ [...]

According to (2.2), the rate of a correct processor (hardware) clock is between $1-\rho$ and $1+\rho$.
Adding the compensation due to $tick_p$, we find that the rate of $\hat{c}_p$ must be between
$1 - \rho - \frac{\alpha(\delta)}{RI}$ and $1 + \rho + \frac{\alpha(\delta)}{RI}$. Thus, if $\hat{\rho}$ satisfies

[...]

then (2.4) will hold.


